
    Gerador de Estímulos para Teste de Circuitos Integrados (Stimulus Generator for Integrated Circuit Testing)

    Get PDF

    Soft Error Effects on Arm Microprocessors: Early Estimations versus Chip Measurements

    Get PDF
    Extensive research efforts are being carried out to evaluate and improve the reliability of computing devices, either through beam experiments or through simulation-based fault injection. Unfortunately, it is still largely unclear to what extent fault injection can provide an accurate error rate estimation at early design stages and whether beam experiments can be used to identify the weakest resources in a device. The importance and challenges of a timely yet realistic reliability evaluation grow with the increase of complexity in both the hardware domain, with the integration of different types of cores in an SoC (System-on-Chip), and the software domain, with the OS (operating system) required to take full advantage of the available resources. In this paper, we combine and analyze data gathered with extensive beam experiments (on the final physical CPU hardware) and microarchitectural fault injections (on early microarchitectural CPU models). We target a standalone Arm Cortex-A5 CPU and an Arm Cortex-A9 CPU integrated into an SoC and evaluate their reliability in bare-metal and Linux-based configurations. Combining experimental data that covers more than 18 million years of device time with the results of more than 176,000 injections, we find that both the SoC integration and the presence of the OS increase the system DUE (Detected Unrecoverable Error) rate, for different reasons, but do not significantly impact the SDC (Silent Data Corruption) rate, which is solely attributed to the CPU core. Our reliability analysis demonstrates that, even considering SoC integration and OS inclusion, early, pre-silicon microarchitecture-level fault injection delivers accurate SDC rate estimations and lower bounds for the DUE rates.
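
    The DUE and SDC rates above come from two complementary measurements: a particle-beam cross-section on silicon and a fault-injection campaign on a microarchitectural model. As a rough illustration of how such numbers are usually derived (the formulas are the standard cross-section/FIT definitions; all values below are hypothetical and not taken from the paper), a minimal sketch in Python:

        # Illustrative only: standard beam-test and fault-injection arithmetic.
        # All numbers are hypothetical; none come from the paper above.

        NATURAL_FLUX = 13.0   # reference sea-level neutron flux, ~13 n/cm^2/h

        def beam_fit(errors, fluence, natural_flux=NATURAL_FLUX):
            """FIT (failures per 10^9 device-hours) from a beam run:
            cross-section (errors per unit fluence) scaled by the natural flux."""
            cross_section = errors / fluence          # cm^2 per device
            return cross_section * natural_flux * 1e9

        def injection_avf(outcomes):
            """Fraction of injected faults that became visible failures (SDC or DUE)."""
            total = sum(outcomes.values())
            return (outcomes.get("SDC", 0) + outcomes.get("DUE", 0)) / total

        print(beam_fit(errors=42, fluence=3.0e11))                              # ~1.8 FIT
        print(injection_avf({"Masked": 60_000, "SDC": 4_000, "DUE": 8_000}))    # ~0.17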

    Entendendo o impacto de integração de núcleos e sistema operacional sobre a confiabilidade em sistemas baseados em ARM (Understanding the impact of core integration and the operating system on the reliability of ARM-based systems)

    No full text
    Reliability has become one of the main concerns for computing devices employed in several domains. This concern only deepens with the increasing integration of peripherals and accelerators on the same chip. To evaluate computing system reliability, fault injection and radiation experiments are used. Fault injection in microarchitectural models of the processor provides deep insights into fault propagation through the entire system stack, including the operating system. Beam experiments, on the other hand, estimate the device's expected soft error rate under realistic physical conditions by exposing it to an accelerated particle beam. Combining beam experiment and fault injection data can deliver deep insights into the device's expected reliability when deployed in the field. However, it is still largely unclear whether fault injection error rates can be compared to those reported by beam experiments and how this comparison can lead to informed soft error protection decisions in early stages of system design. In this work, we first present and analyze data gathered with extensive beam experiments (on physical CPU hardware) and microarchitectural fault injections (on an equivalent CPU model in Gem5) performed with 13 different benchmarks executed on top of Linux on an ARM Cortex-A9 microprocessor. We then compare the soft error rate estimations based on neutron accelerated beam and fault injection experiments. We show that, for most benchmarks, fault injection can be used to predict the Silent Data Corruption (SDC) rate and the Application Crash rate very accurately. The System Crash rate measured with beam experiments, however, is much larger than the one estimated by fault injection, due to unknown proprietary parts of the physical hardware platform that cannot be modeled in the simulator. Overall, our analysis shows that the relative difference between the total error rates of the beam experiments and the fault injection experiments is limited to a narrow range of values and is always smaller than one order of magnitude. This narrow range of the expected failure rate of the CPU provides invaluable assistance to designers in making effective soft error protection decisions in early design stages. After that, the impact of core integration and OS interference on the reliability of Arm microprocessors is also analyzed and quantified. In this second analysis, besides the same Arm Cortex-A9 used previously, a standalone Arm Cortex-A5 is also tested, both with neutron beam experiments and with microarchitecture-level fault injections (on equivalent models of the A5 and A9 CPUs in the Gem5 simulator). Correlating the beam experiments with the fault injection results, we find that, due to the peripherals and interfaces, the integration of various cores significantly increases the System Crash rates but has a negligible impact on the SDC rate, which is attributed to the CPU cores. Moreover, the OS has a beneficial impact on Application Crashes but not on System Crashes or SDC rates. The results of this second analysis firmly confirm, on two different CPU cores, the initial findings and speculations of the first analysis: the SDC part of the overall system failure rate is minimally affected by SoC integration and by the presence of the OS, while the Crash parts are more severely affected by both aspects.
    The findings can be employed to support diligent design decisions for CPU core error protection at the hardware or software level.
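
    The failure categories used throughout this comparison (Masked, SDC, Application Crash, System Crash) are typically assigned by comparing each injected or irradiated run against a fault-free golden run. The sketch below illustrates that classification step; the run-status fields and the comparison logic are illustrative assumptions, not the authors' actual tooling:

        # Illustrative outcome classification for a single run, compared against a
        # golden (fault-free) reference. Field names and logic are assumptions.
        from dataclasses import dataclass

        @dataclass
        class RunResult:
            finished: bool        # workload reached its normal exit point
            os_responsive: bool   # OS/simulator still alive (no panic, no hang)
            output: bytes         # data produced by the benchmark

        def classify(run: RunResult, golden: RunResult) -> str:
            if not run.os_responsive:
                return "System Crash"        # kernel panic, hang, unrecoverable exception
            if not run.finished:
                return "Application Crash"   # process dies, but the system survives
            if run.output != golden.output:
                return "SDC"                 # completes silently with wrong output
            return "Masked"                  # fault had no visible effect

        golden = RunResult(True, True, b"42\n")
        print(classify(RunResult(True, True, b"41\n"), golden))   # -> SDC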

    Gerador de padrões de teste para células sequenciais (Test pattern generator for sequential cells)

    No full text
    The validation of the standard cell libraries used in digital integrated circuit design is a crucial task. However, the validation of sequential logic gates is quite complex due to the inherent memory effect found in these devices. In this work, a generic test pattern generator to be applied in the validation of sequential cells is proposed. This generator is expected to be independent of the behavior of the cell under test, to change only one input per step, and to be cyclic. To solve the problem, it is modeled as a graph over which an Euler cycle must be found. To find such a cycle, a modified depth-first search is proposed. The generator is first validated using behavioral descriptions of several different sequential cells; it is also validated using several different topologies. Finally, the possibility of a hardware implementation is proposed and analyzed.
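
    The core of the method is the graph step. One natural construction (the abstract does not spell it out) treats input states as vertices and single-input changes as edges, so that an Euler cycle yields a cyclic stimulus exercising every single-input transition exactly once. The sketch below uses the standard Hierholzer-style routine, which is the usual way such a modified depth-first search is implemented; the 2-bit example graph is hypothetical:

        # Generic Euler-cycle search (Hierholzer / modified DFS). Assumes the
        # multigraph is connected and every vertex has even degree.
        from collections import defaultdict

        def euler_cycle(edges):
            """Return a closed walk that uses every edge exactly once."""
            adj = defaultdict(list)
            for u, v in edges:
                adj[u].append(v)   # undirected edge, stored in both directions
                adj[v].append(u)

            stack, cycle = [edges[0][0]], []
            while stack:
                v = stack[-1]
                if adj[v]:
                    u = adj[v].pop()
                    adj[u].remove(v)   # consume the edge from the other side too
                    stack.append(u)
                else:
                    cycle.append(stack.pop())
            return cycle

        # Hypothetical example: vertices are 2-bit input vectors, each edge flips one input
        edges = [("00", "01"), ("01", "11"), ("11", "10"), ("10", "00")]
        print(euler_cycle(edges))   # ['00', '01', '11', '10', '00']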

    The Impact of SoC Integration and OS Deployment on the Reliability of Arm Processors

    No full text
    Arm CPU architectures, thanks to their efficiency and flexibility, have been widely adopted in portable user devices such as smartphones, tablets, and laptops. Recently, the high computing efficiency, together with the unique possibility that Arm offers to adapt the architecture for a specific application, has pushed the adoption of Arm-based systems both in HPC (High Performance Computing) applications and in autonomous vehicles. The possibility of modifying the Arm architecture can be extremely beneficial, as selective fault tolerance solutions can be added at the microarchitectural level. The current trend in the design of computing devices is to integrate several functionalities on the same SoC (System-on-Chip). Modern SoCs usually integrate one or more CPUs and one or more accelerators, such as GPUs (Graphics Processing Units) or FPGAs (Field Programmable Gate Arrays). These SoCs typically allow the computing cores to share common memories, which significantly improves performance and reduces total power consumption but may impact the system's reliability. In this work, we evaluate the impact of SoC integration and OS deployment using beam experiments and microarchitectural fault injection.

    Demystifying Soft Error Assessment Strategies on ARM CPUs: Microarchitectural Fault Injection vs. Neutron Beam Experiments

    No full text
    Fault injection in early microarchitecture-level simulation models of a CPU and beam experiments on the final physical CPU chip are two established methodologies to assess the soft error reliability of a microprocessor at different stages of its design flow. Beam experiments, on one hand, estimate the device's expected soft error rate under realistic physical conditions by exposing it to accelerated particle fluxes. Fault injection in microarchitectural models of the processor, on the other hand, provides deep insights into fault propagation through the entire system stack, including the operating system. Combining beam experiment and fault injection data can deliver deep insights into the device's expected reliability when deployed in the field. However, it is still largely unclear whether fault injection error rates can be compared to those reported by beam experiments and how this comparison can lead to informed soft error protection decisions in early stages of system design. In this paper, we present and analyze data gathered with extensive beam experiments (on physical CPU hardware) and microarchitectural fault injections (on an equivalent CPU model in Gem5) performed with 13 different benchmarks executed on top of Linux on an ARM Cortex-A9 microprocessor. We combine experimental data that covers more than 2.9 million years of natural exposure with the results of more than 80,000 injections. We then compare the soft error rate estimations based on neutron beam and fault injection experiments. We show that, for most benchmarks, fault injection can be used to predict the Silent Data Corruption (SDC) rate and the Application Crash rate very accurately. The System Crash rate measured with beam experiments, however, is much larger than the one estimated by fault injection, due to unknown proprietary parts of the physical hardware platform that cannot be modeled in the simulator. Overall, our analysis shows that the relative difference between the total error rates of the beam experiments and the fault injection experiments is limited to a narrow range of values and is always smaller than one order of magnitude. This narrow range of the expected failure rate of the CPU provides invaluable assistance to designers in making effective soft error protection decisions in early design stages.
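
    The figure of "2.9 million years of natural exposure" is the usual way an accelerated beam run is reported: the fluence received under the beam is divided by the reference natural neutron flux at sea level to get an equivalent amount of field time. A minimal sketch of that conversion, with hypothetical numbers:

        # Converting an accelerated-beam fluence into equivalent natural device time.
        # The flux value is the common ~13 n/cm^2/h sea-level reference; the fluence
        # below is hypothetical, not the paper's.
        NATURAL_FLUX = 13.0          # neutrons/cm^2/h
        HOURS_PER_YEAR = 24 * 365

        def equivalent_device_years(fluence, devices_under_test=1):
            """Years of natural exposure represented by a beam fluence (n/cm^2),
            summed over all identical devices irradiated together."""
            hours = fluence / NATURAL_FLUX
            return hours * devices_under_test / HOURS_PER_YEAR

        print(f"{equivalent_device_years(1.0e11):.2e} device-years")   # ~8.78e+05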

    Candida bloodstream infections in intensive care units: analysis of the extended prevalence of infection in intensive care unit study

    No full text
    OBJECTIVES: To provide a global, up-to-date picture of the prevalence, treatment, and outcomes of Candida bloodstream infections in intensive care unit patients and compare Candida with bacterial bloodstream infection. DESIGN: A retrospective analysis of the Extended Prevalence of Infection in the ICU Study (EPIC II). Demographic, physiological, infection-related and therapeutic data were collected. Patients were grouped as having Candida, Gram-positive, Gram-negative, and combined Candida/bacterial bloodstream infection. Outcome data were assessed at intensive care unit and hospital discharge. SETTING: EPIC II included 1265 intensive care units in 76 countries. PATIENTS: Patients in participating intensive care units on the study day. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: Of the 14,414 patients in EPIC II, 99 patients had Candida bloodstream infections, for a prevalence of 6.9 per 1000 patients. Sixty-one patients had candidemia alone and 38 patients had combined bloodstream infections. Candida albicans (n = 70) was the predominant species. Primary therapy included monotherapy with fluconazole (n = 39), caspofungin (n = 16), and a polyene-based product (n = 12). Combination therapy was infrequently used (n = 10). Compared with patients with Gram-positive (n = 420) and Gram-negative (n = 264) bloodstream infections, patients with candidemia were more likely to have solid tumors (p < .05) and appeared to have been in an intensive care unit longer (14 days [range, 5-25 days], 8 days [range, 3-20 days], and 10 days [range, 2-23 days], respectively), but this difference was not statistically significant. Severity of illness and organ dysfunction scores were similar between groups. Patients with Candida bloodstream infections, compared with patients with Gram-positive and Gram-negative bloodstream infections, had the greatest crude intensive care unit mortality rate (42.6%, 25.3%, and 29.1%, respectively) and longer intensive care unit lengths of stay (median [interquartile range]: 33 days [18-44], 20 days [9-43], and 21 days [8-46], respectively); however, these differences were not statistically significant. CONCLUSION: Candidemia remains a significant problem in intensive care unit patients. In the EPIC II population, Candida albicans was the most common organism and fluconazole remained the predominant antifungal agent used. Candida bloodstream infections are associated with high intensive care unit and hospital mortality rates and resource use.